Reading - MAS962, Pustejovsky

Greg Detre

Monday, October 07, 2002

 

As Pustejovsky puts it, "the premise of the current work is that it is the lexicalisation strategies in a language which give us our first glimpse of the concepts behind our thoughts". To this end, he attempts to formalise a "type constructional system of concepts", by which he means a rich heterogeneous hierarchy of categories and relations "within which we can construct increasingly complex types from a set of basic building blocks".

His stated aim is to outline a methodology for constructing these types "based on the dual concerns of capturing linguistic generalisations and satisfying metaphysical considerations". I'm not exactly sure what he means by this, but I think he is saying that he is attempting to bring together syntactic and psycholinguistic evidence with our intuitions about what constitute the cleanest high-level partitions in our minds, using a more structured categorisation methodology than, for example, WordNet's breakdown of the top level into more than 20 domains.

Pustejovsky's system is based on his Generative Lexicon, a modern formalisation of the human mental lexicon that aims to embody its complex, dynamic, compositional functionality. Generative Lexicon Theory has at least four levels of linguistic representation:

argument structure - this specifies the number and type of arguments attached to a lexical item

event structure - the type of an event (state, process or transition)

it is assumed that events can be discretised somehow, but also broken down further into sub-events with sub-predicates

qualia structure - the different flavours that we can use to describe and bind together predicates

lexical inheritance structure - how a lexical structure is related to other structures

I'll focus on the qualia structure, since this seems to be instrumental to his enterprise. It is very loosely based on Aristotle's original four causes (formal, material, final, efficient), using the terms:

formal - this is the closest Pustejovsky comes to an "is-a" relation

constitutive - like WordNet's "meronymy"

telic - the purpose/function ("telos")

agentive - how it originated
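
The four roles can be pictured as the fields of a small record. What follows is a minimal Python sketch rather than Pustejovsky's own notation: the class name `Qualia`, the helper `filled_roles` and the example values for "knife" and "rock" are my own illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Qualia:
    """Hypothetical record with one optional slot per qualia role."""
    formal: Optional[str] = None        # what kind of thing it is ("is-a")
    constitutive: Optional[str] = None  # what it is made of, or its parts
    telic: Optional[str] = None         # its purpose or function
    agentive: Optional[str] = None      # how it came into being

    def filled_roles(self) -> List[str]:
        """Return the names of the roles that have a value, in field order."""
        return [name for name, value in vars(self).items() if value is not None]


# Illustrative values only: these are assumptions, not taken from the paper.
knife = Qualia(formal="artifact", constitutive="blade and handle",
               telic="cut", agentive="manufacture")
rock = Qualia(formal="physical object", constitutive="mineral")

print(knife.filled_roles())
print(rock.filled_roles())
```

On this reading, an item with no telic or agentive value (like the hypothetical `rock`) is described purely by what it is and what it is made of.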

Using these, he is able to build a tree whose top levels look like this:

entity
    natural
    functional
    complex
event
    natural
    functional
    complex
quality
    natural
    functional
    complex

He defines the main natural/functional/complex divisions in terms of the qualia structure, e.g. natural kinds can be described most purely without any reference to function (telic) or being devised (agentive). He builds up a sort of 2x2 matrix of categories based on these binary conditions of having-function and being-man-made, moving from natural to functional types. As far as I could tell, complex objects combine two or more natural/functional predicates by a "Dot Object Construction".
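
One way to read the natural/functional split is as a test on which qualia roles are filled. The Python sketch below is only my interpretation: the names `matrix_cell`, `classify` and `dot` are hypothetical, the string labels are mine, and representing a complex type as a bare pair of component types is a deliberate simplification of whatever the Dot Object Construction actually does.

```python
from typing import NamedTuple, Optional, Tuple


class Qualia(NamedTuple):
    """Only the two roles that matter for the natural/functional split."""
    telic: Optional[str] = None     # purpose or function, if any
    agentive: Optional[str] = None  # how it was made, if it was made


def matrix_cell(q: Qualia) -> Tuple[bool, bool]:
    """Locate a type in the 2x2 matrix: (has a function, is man-made)."""
    return (q.telic is not None, q.agentive is not None)


def classify(q: Qualia) -> str:
    """Call a type natural only if it needs neither telic nor agentive."""
    has_function, is_made = matrix_cell(q)
    return "functional" if (has_function or is_made) else "natural"


def dot(left: str, right: str) -> Tuple[str, str]:
    """Pair two type names as a stand-in for a complex (dot) type."""
    return (left, right)


# Illustrative values only: assumptions, not examples from the paper.
print(classify(Qualia()))                                    # no telic, no agentive
print(classify(Qualia(telic="cut", agentive="manufacture")))
print(dot("physical", "info"))                               # e.g. a book
```

The three off-diagonal and diagonal cells other than (False, False) are all lumped together as "functional" here, which is cruder than the gradual move from natural to functional types that the text describes.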

The tree expands further into categories like:

entity
    natural
        physical
            count
            mass
        abstract
            info
    functional
        direct
            e.g. coffee
        purpose
            e.g. knife
    complex
        e.g. book
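
The hierarchy can be held as a nested mapping, which makes it easy to enumerate the path from the root to each leaf type. This is a sketch under my own assumptions: the dictionary layout and the helper `paths` are hypothetical, and only the node names and the three example words come from the tree above.

```python
from typing import Dict, Iterator, Tuple

# A node maps each child name to that child's own subtree; leaves are empty.
Tree = Dict[str, "Tree"]

ONTOLOGY: Tree = {
    "entity": {
        "natural": {
            "physical": {"count": {}, "mass": {}},
            "abstract": {"info": {}},
        },
        "functional": {"direct": {}, "purpose": {}},
        "complex": {},
    }
}

# The example words given in the tree, keyed to the path they sit under.
EXAMPLES: Dict[str, Tuple[str, ...]] = {
    "coffee": ("entity", "functional", "direct"),
    "knife": ("entity", "functional", "purpose"),
    "book": ("entity", "complex"),
}


def paths(tree: Tree, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the path to every leaf, depth first, in insertion order."""
    for name, children in tree.items():
        here = prefix + (name,)
        if children:
            yield from paths(children, here)
        else:
            yield here


for path in paths(ONTOLOGY):
    print("/".join(path))
```

Writing the tree out this way does nothing to justify the partitions themselves; it only makes explicit that each type is identified by its path.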

It wasn't clear to me how we could derive these sub-sub-partitions using any "methodology" beyond the kind of intuitions that we bring to bear as native speakers. Perhaps Pustejovsky would argue that the detailed discussion of operators and types can help us here, especially given a firm top-level grounding from which to work. Either way, it wasn't clear to me whether the process could be systematised through some sort of statistical or machine-learning category-gathering approach, or whether a hand-coded approach like WordNet's would be called for. Given his emphasis on computational tractability, as well as his attempts to relate semantic information to syntactic form, I presume that he expounds methods for systematising the process elsewhere.

I am not entirely clear about how the system would adequately represent the whole gamut of human concepts. If we take concepts like "colour" or "number", I can see that either of them could go somewhere in the tree like Entity/Natural/Abstract/Information, or under Quality/..., and I think Pustejovsky's complicated operator system would allow us some means to tie these together. However, I don't quite see how categorising them in this way, building their qualia structures out of the primitives, would ever quite be enough to capture their complete meaning to us. To put it another way, I don't see how any number of compositions and type-coercions and the like would allow us to express mathematical concepts within the GL system. Perhaps "number" and certain other mathematical operations are a special conceptual subset, and deserve their own treatment elsewhere. But I still don't see how we'd fully capture things like function words and linguistic relations, colours and basic experiential concepts or really any other fully-fledged, densely inter-connected yet self-contained human-level concept. As far as I can tell, these are fair criticisms, because Pustejovsky tells us that he is seeking a (fully) "satisfactory definition for category or concept, one which both meets formal demands on soundness and completeness, and practical demands on relevance to real-world tasks of classification".

Personally, I still have major reservations about any system that attempts to describe concepts in terms of other concepts (even primitives), without a much heavier emphasis on cognitive and, in turn, perceptuomotor structures. I think Pustejovsky's model can be criticised on these grounds, for the difficulty of systematically generating the ontology in the first place, for being based on our intuitions, and for being inevitably too clean in its delineation of concept boundaries. In part, I am criticising the idea that we can examine our sub-cognitive concepts by examining our linguistic concepts in isolation. There must be a strong relation and interaction between them, but it seems likely to me that language must add something extra to the information from our sub-cognitive modalities, while at the same time stripping away aspects of our perceptuomotor representations that can't or don't need to be communicated. This may be what Pustejovsky means by also capturing our "metaphysical considerations", and I do think that (for example) his use of the telic and agentive qualia to create virtual types is ingenious, promising and useful in this regard. However, I am ultimately left with a feeling of dissatisfaction and a reluctance to believe that any such lexical system could ever capture our non-rigorous concepts in all their nebulous glory.

 

I found this summary/discussion of Pustejovsky's views very useful:

http://www-users.cs.york.ac.uk/~mdeboni/research/generative_lexicon.html